Valuable Datasets Being Stored through Filecoin Slingshot
Over the last two weeks, the Filecoin community has stored more than 170 TiB of valuable data through more than 200,000 storage deals on the Filecoin network. This data is being stored by participants in the Slingshot competition, a community competition for storage clients and developers to onboard valuable data to the Filecoin network.
Collectively, we have seen teams create Dropbox-style storage apps, games, and UIs around a variety of datasets – scientific, cultural, entertainment, software, and more! Let’s dive into some of the most interesting datasets we’ve seen so far in this competition.
Team: Smartcity
Dataset: Optical fiber data
Data Size: ~50-100 TiB
Smartcity is building a sensor-based network and data analysis system that uses data generated by previously laid optical fiber cables to monitor temperature, pressure, vibration, sound, and other information around the optical fiber in cities all over the world. This data can be used by researchers, municipal managers, and traffic facility supervisors to assist in urban planning and construction. It can also be used to build predictive monitoring systems and, in some cases, even to help predict earthquakes. These cables act like a nervous system for cities, and the Smartcity project is harnessing the data they generate to help cities run more efficiently.
This data from optical fiber sensor networks is stored on Filecoin and IPFS. Deep learning models will be trained on this data and will eventually be used to analyze various signal types and perform pattern recognition to classify and predict events. The team has created a webpage to play sounds collected from these sensors, as well as a query interface to send these files from IPFS and Filecoin to remote web servers that perform the deep learning processing.
Team: Zangshell
Dataset: Virtual Reality Data
Data Size: ~50 TiB
Zangshell is a virtual reality (VR) and augmented reality (AR) system that has collected original video from specialty cameras, then processed and stored the data. The data was originally used to serve the real estate and municipal engineering industries. Previously, this data was stored in a centralized data center, but the team is migrating it to the IPFS and Filecoin networks to save on storage costs. Going forward, the Zangshell team intends to move their storage and data processing pipelines directly to IPFS and Filecoin.
Zangshell users will be able to watch these VR/AR images or videos directly in their web browsers. They can zoom in or zoom out of a scene, and also view virtual reality scenes from different angles by using a PC mouse or gestures on a mobile screen.
Team: Starry Sky in Yunnan
Dataset: Astronomical and meteorological data
Data Size: ~200 TiB
The Starry Sky in Yunnan team is storing a repository of astronomical and weather data, including charts, analysis, and photos. This data is generated, collected, and stored by Yunnan University in Kunming, China. The team has recently collaborated with the university on a few different projects and had already discussed storing this data on Filecoin. The announcement of Slingshot gave them a good reason to kick off this new project.
The team believes this data can be useful for future generations, and that Filecoin’s mission of storing humanity’s most important information makes the Filecoin network a good storage solution for it. They plan to onboard members of the university and other organizations dealing with astronomical and weather data as their first users.
Other Valuable Datasets
In addition to the datasets Slingshot teams are storing for their specific use cases, there are a multitude of other important and publicly available datasets that Slingshot teams are encouraged to store. These datasets have the potential to provide critical medical, scientific and cultural information to users around the world. Some of these datasets include:
- Space Data: The Sloan Digital Sky Survey (SDSS) has created detailed three-dimensional maps of the universe, featuring deep multi-color images of one third of the sky, and spectra for more than three million astronomical objects. The SDSS has been working for more than 20 years to make a map of the universe, and continues to add data. Researchers have already used this data to discover new types of quasars and a new population of “ultra-faint” dwarf galaxies orbiting the Milky Way, and to gather data on “hypervelocity stars,” which passed too close to our Galaxy’s central black hole and are now moving so fast that they will escape from the Milky Way entirely.
- COVID-19 Data: The US government and a coalition of leading research groups have prepared the public COVID-19 Open Research Dataset (CORD-19) – over 200,000 scholarly articles about COVID-19, SARS-CoV-2, and related coronaviruses. This dataset will allow the global research community to generate new insights to support the fight against COVID-19 and prevent future pandemics.
- Museum Data: In an effort to improve government data quality and availability for educational, personal and commercial use, the National Palace Museum of Taiwan established a public “Database Search” and “Image Downloads” repository in 2015 to make the museum’s images and research materials more accessible to the public. The data includes exhibition information as well as images of museum artifacts, including paintings, antiquities, clothing, inscriptions, and calligraphy.
Every day, our world produces exabytes of data. Much of this data has the potential to help answer some of humanity’s most critical questions. We believe the Filecoin network will play a valuable role in addressing these big questions by ensuring that our species’ data is preserved and accessible to future generations. We’re excited to see Slingshot participants working together to help the network achieve this potential.