Why do we produce snow data for Alpine Solar Projects?
When it comes to Alpine solar farms, snow is a double-edged sword. Snow can increase energy yields through higher albedo and panel cleaning, but it can also reduce yields when panels are covered or buried.
Over the last year, Wegaw has been providing snow datasets for Alpine solar project planning and energy production forecasting.
High-resolution snow height maps help identify areas with minimal or excessive snow accumulation, allowing for the optimal placement of solar panels.
In this blog post, we cover how Wegaw has used Deep Learning in combination with multiple datasets to create both near-real-time and historical archives of high-resolution snow height.
Capturing the Reality
Highly accurate snow height (HS) maps can be derived by subtracting a ‘snow-off’ Digital Surface Model (DSM) from a ‘snow-on’ DSM. These DSMs are usually acquired via photogrammetry or laser scanning, typically using Unmanned Aerial Vehicles (UAVs).
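As a minimal sketch of the DSM differencing described above, assuming both surface models are already co-registered on the same grid, snow height is the snow-on elevation minus the snow-off elevation. The function name and the clipping of small negative values to zero are illustrative assumptions, not a description of the SLF or Wegaw processing chain.

```python
import numpy as np

def snow_height_from_dsms(dsm_snow_on, dsm_snow_off):
    """Return HS = snow-on DSM minus snow-off DSM for two co-registered grids."""
    on = np.asarray(dsm_snow_on, dtype=float)
    off = np.asarray(dsm_snow_off, dtype=float)
    if on.shape != off.shape:
        raise ValueError("DSMs must share the same grid")
    hs = on - off
    # Assumption: small negative differences are survey noise, so clip them to zero
    return np.where(hs < 0, 0.0, hs)

# Synthetic elevations in metres, for illustration only
snow_off = np.array([[2000.0, 2001.0], [2002.0, 2003.0]])
snow_on = np.array([[2001.5, 2001.0], [2002.8, 2002.9]])
hs = snow_height_from_dsms(snow_on, snow_off)
```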
Photogrammetric methods apply a Structure-from-Motion (SfM) algorithm to a series of overlapping optical images from different angles to create a 3D point cloud. This is a methodology often used by the WSL Institute for Snow and Avalanche Research (SLF) to contribute to their growing database of spatially distributed Snow Height.
Laser scanning uses Light Detection and Ranging (LiDAR) sensors, mounted on UAVs or aircraft, that emit light pulses and measure the time each pulse takes to return from the surface. Because the speed of light is known, the distance to the surface can be calculated and, combined with the known position of the sensor, converted to the elevation of the earth’s surface. This method is used by the NASA/JPL Airborne Snow Observatory (ASO) aircraft survey campaigns, which have generated a large dataset of 3m resolution gridded snow height maps across North America.
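The time-of-flight principle can be illustrated with a few lines of arithmetic. This sketch assumes a pulse fired straight down (nadir) from a sensor of known altitude; real surveys also correct for scan angle and platform attitude, which are left out here. The function names are hypothetical.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0

def lidar_range_m(round_trip_time_s):
    """One-way distance to the surface: the pulse travels there and back."""
    return SPEED_OF_LIGHT_M_S * round_trip_time_s / 2.0

def surface_elevation_m(sensor_altitude_m, round_trip_time_s):
    """Surface elevation for a nadir-pointing pulse (simplifying assumption)."""
    return sensor_altitude_m - lidar_range_m(round_trip_time_s)

# A pulse that returns after one microsecond has travelled roughly 150 m each way
example_range = lidar_range_m(1e-6)
example_elevation = surface_elevation_m(2500.0, 1e-6)
```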
Obtaining UAV data demands significant resources, making it an impractical option for obtaining cost-effective, frequent updates. Moreover, if the data collection window is missed, historical records from past seasons cannot be retrospectively acquired.
Bridging the Gap: The Wegaw Solution
Wegaw produces daily snow cover extent (SCE), snow height (HS), and snow water equivalent (SWE) maps at resolutions up to 2m.
The Wegaw snow system utilises Machine Learning (ML)-based fusion of various geospatial data sources, including but not limited to:
- In-situ ultrasonic HS measurements.
- In-situ GNSS-derived HS.
- Optical satellite imagery from ESA Copernicus Sentinel-2 & NASA/NOAA MODIS/VIIRS.
- Atmospheric data products such as the NOAA Global Forecast System GFS or ECMWF Reanalysis v5 (ERA5).
Depending on where in the world we are building snow data products, we can pull data from a range of sources, such as the IMIS and MeteoSwiss networks in Switzerland or the SNOTEL network in the western United States. In areas of special importance, we install stations as part of our own network of GNSS-based snow measurement stations.
Fusion of Ground and Satellite Data
We use a generalised additive model (GAM) to interpolate between the snow height observed by the station networks and satellite data. GAMs account for spatial autocorrelation and handle irregularly spaced data by smoothing over neighbouring locations.
For each day in a time-series, each station’s longitude, latitude, elevation, and aspect are fed to the model as predictors for the corresponding target HS measurement. A portion of the snow cover extent edge pixels is added to the training dataset with an HS value of 0 so that the model can estimate the snow line. We use Leave-One-Out Cross-Validation (LOOCV) to select the optimal model configuration. Based on the learned relationships, the model is then used to predict the snow height at each 100m resolution grid cell of known longitude, latitude, aspect, and elevation.
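The LOOCV step can be sketched as follows. To keep the example dependent on NumPy only, a ridge-penalised linear model stands in for the GAM, and the penalty strength plays the role of the configuration being tuned; this is an illustrative substitute, not the model used in production. The station data is synthetic and the function names are hypothetical.

```python
import numpy as np

def fit_ridge(X, y, penalty):
    """Closed-form ridge fit with an unpenalised intercept."""
    A = np.column_stack([np.ones(len(X)), X])
    P = penalty * np.eye(A.shape[1])
    P[0, 0] = 0.0  # do not shrink the intercept
    return np.linalg.solve(A.T @ A + P, A.T @ y)

def predict(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

def loocv_rmse(X, y, penalty):
    """Hold out each station once, refit on the rest, score the held-out prediction."""
    errors = []
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        coef = fit_ridge(X[keep], y[keep], penalty)
        errors.append(predict(coef, X[i:i + 1])[0] - y[i])
    return float(np.sqrt(np.mean(np.square(errors))))

# Synthetic stations: columns are standardised longitude, latitude, elevation, aspect
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
y = 1.0 + 0.8 * X[:, 2] + rng.normal(scale=0.1, size=30)  # HS driven mainly by elevation

candidates = [0.01, 1.0, 100.0]
scores = {p: loocv_rmse(X, y, p) for p in candidates}
best_penalty = min(scores, key=scores.get)
```

Because every station is held out exactly once, LOOCV makes full use of a sparse station network, at the cost of one refit per station per candidate configuration.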
This method enables us to generate country scale estimations of snow height. With some further modelling, we are also able to generate country scale estimations of water stored in snow, which has significant value to hydrological applications, particularly in the case of distributed inflow forecast models.
Finding the Snow Traps
Solar farm sites rarely extend over such large areas, so it is computationally far more efficient to compute the 2m snow height only for a smaller area of interest, after the initial 100m resolution model has been run.
The first step is to upsample the HS using a simple bilinear interpolation algorithm. However, to correctly redistribute the HS estimations at a higher resolution, we need to think about which features can help us identify where snow may or may not accumulate.
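A bilinear upsampling step can be sketched in plain NumPy as below. Going from 100m to 2m cells corresponds to a factor of 50; the tiny example uses a factor of 2 so the result is easy to inspect. The corner-aligned sampling convention is an assumption of this sketch, and other libraries may align cell centres instead.

```python
import numpy as np

def bilinear_upsample(grid, factor):
    """Upsample a 2D grid by an integer factor using bilinear interpolation."""
    grid = np.asarray(grid, dtype=float)
    rows, cols = grid.shape
    # Fine-grid positions expressed in coarse-grid index units (corners aligned)
    r = np.linspace(0, rows - 1, rows * factor)
    c = np.linspace(0, cols - 1, cols * factor)
    r0 = np.floor(r).astype(int)
    c0 = np.floor(c).astype(int)
    r1 = np.minimum(r0 + 1, rows - 1)
    c1 = np.minimum(c0 + 1, cols - 1)
    wr = (r - r0)[:, None]  # vertical weights
    wc = (c - c0)[None, :]  # horizontal weights
    top = grid[r0][:, c0] * (1 - wc) + grid[r0][:, c1] * wc
    bottom = grid[r1][:, c0] * (1 - wc) + grid[r1][:, c1] * wc
    return top * (1 - wr) + bottom * wr

coarse_hs = np.array([[0.0, 1.0], [2.0, 3.0]])  # synthetic coarse HS in metres
fine_hs = bilinear_upsample(coarse_hs, 2)
```

The upsampled field is smooth by construction, which is why terrain-based features are needed afterwards to reintroduce fine-scale variability.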
We have tested the feature importance of a wide variety of geomorphological and satellite based data layers, and found this optimal combination:
- Topographic Position Index (TPI) describes the elevation of a target grid cell relative to the mean elevation of a stated region around it. The computation is performed for all grid cells. TPI has a non-linear correlation with snow height, since concave surfaces trap snow while convex surfaces act as blow-off zones.
- Aspect describes the orientation of a land surface as an angle relative to north. Aspect has a non-linear correlation with snow height, since south-facing slopes are generally more exposed to solar radiation, which affects both accumulation and melt patterns.
- Slope describes the incline of a land surface at a specific point. Steep slopes are prone to snow blow-off. However, the observed relationship between slope and snow height is complex, because snow height increases with elevation and steep slopes also become more common with elevation.
- Snow Persistence Index (SPI) is the normalised persistence of snow cover over a given grid cell, derived as an average over a specified number of years between the beginning of the spring melt period and the end of the summer period. This can be generated using the 7-year time-series from the Wegaw SCE product. Deeper snow takes longer to melt as a function of snow layer insulation, thermal inertia and conduction resistance, so we observe a strong positive linear correlation between the HS and the SPI.
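The four features above can be approximated from a elevation grid and a stack of daily snow cover masks. The sketch below assumes a north-up raster (rows increase southwards), a square TPI neighbourhood that excludes the centre cell, and an SPI defined as the fraction of snow-covered days in the supplied stack. These conventions and function names are assumptions for illustration; the exact definitions used in the Wegaw pipeline may differ.

```python
import numpy as np

def topographic_position_index(dem, radius=1):
    """Elevation of each cell minus the mean elevation of its square neighbourhood."""
    dem = np.asarray(dem, dtype=float)
    rows, cols = dem.shape
    padded = np.pad(dem, radius, mode="edge")
    size = 2 * radius + 1
    total = np.zeros_like(dem)
    for dr in range(size):
        for dc in range(size):
            total += padded[dr:dr + rows, dc:dc + cols]
    neighbour_mean = (total - dem) / (size * size - 1)  # exclude the centre cell
    return dem - neighbour_mean

def slope_and_aspect(dem, cell_size):
    """Slope in degrees and aspect in degrees clockwise from north."""
    dz_drow, dz_dcol = np.gradient(np.asarray(dem, dtype=float), cell_size)
    slope = np.degrees(np.arctan(np.hypot(dz_drow, dz_dcol)))
    # Rows increase southwards, so the downslope azimuth is atan2(-dz/dcol, dz/drow)
    aspect = np.degrees(np.arctan2(-dz_dcol, dz_drow)) % 360.0
    return slope, aspect  # aspect is not meaningful on perfectly flat cells

def snow_persistence_index(sce_stack):
    """Fraction of snow-covered days per cell for a boolean (days, rows, cols) stack."""
    return np.asarray(sce_stack, dtype=float).mean(axis=0)

# Synthetic checks: a pit has negative TPI, a south-facing ramp has aspect 180
pit = np.zeros((3, 3))
pit[1, 1] = -8.0
tpi = topographic_position_index(pit)
ramp = 10.0 - np.arange(4.0)[:, None] * np.ones((1, 4))  # elevation drops southwards
slope, aspect = slope_and_aspect(ramp, cell_size=1.0)
spi = snow_persistence_index(np.array([[[1]], [[0]], [[1]], [[1]]], dtype=bool))
```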
A Deep Learning (DL) model is trained using the TPI, aspect, slope and SPI as predictive features, where the target variable is the factor by which the interpolated HS must be multiplied to obtain the snow height measured in the UAV data. DL architectures are well suited to this task given their ability to approximate the multiple complex, non-linear relationships between the predictive features and the expected accumulation of snow at a specific point.
Each grid cell has a known TPI, aspect, slope, and SPI, which the model will use to estimate how much the interpolated snow height must be increased or decreased. The interpolated HS value is then multiplied by the model output to generate the high resolution snow height map.
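The multiplicative correction can be written out explicitly. In this sketch the training target is the ratio of UAV-measured HS to interpolated HS, and the high-resolution map is the interpolated HS multiplied by the predicted factor. The `min_hs` threshold that masks near-zero denominators is an assumption added here to keep the ratio stable, and the predicted factors are placeholders rather than output from a trained network.

```python
import numpy as np

def correction_target(uav_hs, interpolated_hs, min_hs=0.05):
    """Training target: factor that maps interpolated HS onto UAV-measured HS."""
    uav_hs = np.asarray(uav_hs, dtype=float)
    interpolated_hs = np.asarray(interpolated_hs, dtype=float)
    ratio = np.full(interpolated_hs.shape, np.nan)
    valid = interpolated_hs >= min_hs  # assumption: skip cells with almost no snow
    ratio[valid] = uav_hs[valid] / interpolated_hs[valid]
    return ratio

def apply_correction(interpolated_hs, predicted_factor):
    """High-resolution HS: interpolated HS scaled by the model's predicted factor."""
    return np.asarray(interpolated_hs, dtype=float) * np.asarray(predicted_factor, dtype=float)

# Synthetic values in metres
interpolated = np.array([1.0, 2.0, 0.0])
uav = np.array([1.5, 1.0, 0.2])
target = correction_target(uav, interpolated)

# Placeholder factors standing in for the DL model output
corrected = apply_correction(interpolated, np.array([1.5, 0.5, 1.0]))
```

One consequence of a purely multiplicative correction is that cells with zero interpolated HS stay at zero, whatever the model predicts.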
Are we being fair?
A major challenge associated with using deep learning for environmental science tasks is the reduction of bias in training datasets. Generally, UAV acquisitions are taken at the peak of the snow year, which makes sense for their original purpose, but this could result in a model that learns the peak-season distribution of snow as representative of the whole year. We also need to address spatial biases. For example, is most of our data collected in a specific region? Perhaps we have more data in the areas surrounding the research facilities than in more remote, less accessible regions. To mitigate this issue, we have representatively sampled the broader dataset based on how far into the hydrological year each acquisition falls, as well as on longitude and latitude. Additionally, we commissioned the SLF to perform a series of four flights in the months of February and May at two separate locations with contrasting conditions, either side of the peak of the snow year.
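One simple way to reduce this kind of imbalance is to cap the number of training samples drawn from each stratum, where a stratum combines a seasonal bin with a spatial bin. The sketch below shows that idea only; the stratum labels, the cap, and the function name are hypothetical, and the sampling scheme actually used may be more elaborate.

```python
import numpy as np

def balanced_sample(strata, per_stratum, seed=0):
    """Return sorted indices with at most `per_stratum` samples from each stratum."""
    rng = np.random.default_rng(seed)
    strata = np.asarray(strata)
    chosen = []
    for label in np.unique(strata):
        members = np.flatnonzero(strata == label)
        take = min(per_stratum, members.size)
        chosen.append(rng.choice(members, size=take, replace=False))
    return np.sort(np.concatenate(chosen))

# Hypothetical labels: e.g. (seasonal bin, region bin) encoded as one integer per sample
strata = np.array([0, 0, 0, 0, 1, 2, 2])
selected = balanced_sample(strata, per_stratum=2)
```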
Understanding the results
We can evaluate the resulting output by comparing it to a corresponding photogrammetrically derived snow height product generated by the SLF that has not been seen by the model. In practice, we would test this result against a range of different flight dates and locations, rather than just one.
As we know, the SLF snow products have a very high accuracy (RMSE=15cm), which means they can be used as both training and benchmarking data. Almost 37 million Wegaw modelled data points were compared with the corresponding SLF UAV-measured data points.
The Root Mean Square Error (RMSE) summarises the typical magnitude of the difference between predicted and actual values, with larger errors weighted more heavily. When we compare the Wegaw snow height to the SLF-generated snow height, we find that the RMSE for Wegaw is 70cm. This value indicates the typical level of error we can anticipate at the 2m resolution pixel level. This is an important additional piece of information for decision makers to consider when using the Wegaw snow height data for site location and assessment.
The difference map shows that the Wegaw model overestimates the snow height at higher elevations and underestimates it at lower elevations. This is most likely a result of too few data points in the interpolation stage of the computation. Introducing additional measurement stations to the region would help to reduce this error; however, this cannot be done retrospectively.
When we zoom into the map and look at the detail, we can see that the Wegaw model is able to approximate the general trends of snow height distribution at this level. However, there is still a profound level of variability in snow height that cannot be captured by a method of this simplicity. We would need to introduce additional features, such as dynamic wind data layers and a broader range of geomorphological layers. We may also need a more complex model architecture, perhaps one that can encode deep spatial relationships at multiple scales, or one that can use self-attention to weigh the interdependencies between spatial features within and across different patches.
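For reference, the RMSE and the mean bias between a modelled and a measured snow height grid can be computed as follows. Cells where either grid has no data (NaN) are skipped, which is an assumption of this sketch; the numbers are synthetic and unrelated to the 70cm figure reported above.

```python
import numpy as np

def rmse(predicted, observed):
    """Root mean square error over cells where both grids have data."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    valid = np.isfinite(predicted) & np.isfinite(observed)
    return float(np.sqrt(np.mean((predicted[valid] - observed[valid]) ** 2)))

def mean_bias(predicted, observed):
    """Mean signed error: positive values indicate overestimation."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    valid = np.isfinite(predicted) & np.isfinite(observed)
    return float(np.mean(predicted[valid] - observed[valid]))

# Synthetic HS values in metres; the NaN cell is ignored
modelled = np.array([0.5, 1.5, np.nan])
measured = np.array([1.0, 1.0, 2.0])
error = rmse(modelled, measured)
bias = mean_bias(modelled, measured)
```

Reporting the bias alongside the RMSE helps separate systematic over- or underestimation from random scatter.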
How is this data used for Alpine Solar Projects?
We typically provide our Alpine Solar clients with a historical daily 2m resolution snow height dataset that covers a 7-year period. From this dataset, it is then possible to drop a pin on a specific coordinate in the area of interest and retrieve the expected evolution of snow at that point as an average or a maximum over multiple years.
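Dropping a pin amounts to converting a coordinate to a grid cell and then summarising that cell's time-series across years. The sketch below assumes projected coordinates in metres, a north-up grid, and a data cube laid out as (years, days, rows, cols) with days aligned across years; the layout and function names are hypothetical and not the format of the delivered dataset.

```python
import numpy as np

def pin_to_cell(x, y, x_min, y_max, cell_size):
    """Convert a projected coordinate to (row, col) on a north-up grid."""
    col = int((x - x_min) // cell_size)
    row = int((y_max - y) // cell_size)
    return row, col

def seasonal_profile(hs_cube, row, col):
    """Per-day mean and maximum HS across years for one cell."""
    series = np.asarray(hs_cube, dtype=float)[:, :, row, col]  # shape (years, days)
    return series.mean(axis=0), series.max(axis=0)

# Synthetic cube: 2 years, 3 days, 2 x 2 cells of 2 m
hs_cube = np.arange(2 * 3 * 2 * 2, dtype=float).reshape(2, 3, 2, 2)
row, col = pin_to_cell(x=3.0, y=9.0, x_min=0.0, y_max=10.0, cell_size=2.0)
mean_hs, max_hs = seasonal_profile(hs_cube, row, col)
```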
It is then possible to infer whether it’s a good idea to place the solar panels in the given location, and how high a given panel should be placed at a given point.
The graph below demonstrates that when a solar panel is placed at a specific point 3 meters above the ground, it spends 100 more days beneath the snow than if it were placed just 1 meter higher.
So, with the aid of the Wegaw snow height product, an alpine solar provider can avoid losing 100 days of solar production between the months of January and May.
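The comparison between mounting heights boils down to counting the days on which the snow height at the chosen point exceeds the panel clearance. The short series below is synthetic and only shows the calculation; it does not reproduce the 100-day result from the graph.

```python
import numpy as np

def days_buried(hs_series_m, clearance_m):
    """Number of days on which snow height exceeds the panel clearance."""
    return int(np.count_nonzero(np.asarray(hs_series_m, dtype=float) > clearance_m))

# Synthetic daily HS at one point, in metres
hs_series = [2.5, 3.2, 3.6, 4.1, 3.4, 2.0]
buried_at_3m = days_buried(hs_series, 3.0)
buried_at_4m = days_buried(hs_series, 4.0)
days_saved = buried_at_3m - buried_at_4m
```

Running the same count over each year of a multi-year series gives a range of outcomes rather than a single figure, which is more useful when choosing a mounting height.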
Conclusion
By combining automated in-situ data, geomorphology, satellite data, UAV data, atmospheric data and modern DL methods, we have demonstrated how high-resolution snow maps can be created to support alpine solar site location and assessment, and to optimise the energy generation of solar projects.
There’s a very long way to go until we can match the accuracy of photogrammetry or LiDAR techniques, but being able to provide data retrospectively, cheaply and quickly has already proven valuable to the solar sector.
In the future, the team at Wegaw is looking to incorporate the dynamic influence of wind-driven snow sheltering patterns, and to explore more complex deep learning architectures to help increase the accuracy of our high-resolution snow products.
Stay tuned for the results!