Abstract: Data lakes are typically built to manage large volumes of data from a wide variety of sources and formats. However, the concept of a data lake can apply to data of any scale. In this talk, Alex will walk through a project in which he received a couple of gigabytes of data each quarter from a client who needed access to the combined dataset within a day. Using a variety of AWS serverless services, he built a serverless data lake that processes the data into a data warehouse and serves up high-quality, curated data marts in a few minutes. The best part is that the data lake and warehouse cost only about $5 a month to run.
Abstract: Water quality data is vital for assessing both risk and the effectiveness of remedies in or near water bodies. Although individual manufacturers of water quality devices typically provide software for downloading and visualizing data, integrating data from different sensors is left to technical scientists. To address this problem, we created the Multi-Sensor Data System (MSDS). The system is designed to automate the ingestion of varying file types into a data lake, process that data into warehouses and marts, and serve up the integrated results via interactive dashboards. It also supports user-defined quality assurance filters, with updates in real time. Processing from raw data file upload to dashboard availability takes approximately two minutes. The system was built using Amazon Web Services serverless technologies with a focus on low cost and low maintenance, while maintaining fast response times and high reliability.
Abstract: Recent advances in technology have led to the collection of data at speeds and volumes far greater than ever before. In parallel, data management tools have been developed that allow those who have them to more easily gather, process, and interpret data. In this presentation, the MTSU Data Science Institute will present the concept and execution of their Multi-Sensor Data System (MSDS), using a water quality dataset to demonstrate its functionality. The system was designed to automate the ingestion of varying files into a data lake, process them into a data warehouse and marts, and serve up the results via a dashboard in minutes. It was built using AWS serverless technologies with a focus on low cost, low maintenance, high performance, and reliability.
Abstract: Recent advances in technology have led to the collection of data at speeds and volumes far greater than ever before. Data management tools have been developed alongside this expansion, allowing those who have them to more easily gather, access, process, and interpret data. In this presentation, the MTSU Data Science Institute will present the concept and execution of their Multi-Sensor Data System (MSDS), using an applied water quality dataset from an ongoing project to demonstrate its functionality. The system was designed to automate the ingestion of multiple different data files and serve up the results via an interactive dashboard in minutes. This cloud platform was built using Amazon Web Services serverless technology with an emphasis on low cost, low maintenance, high performance, and reliability.
Abstract: There is a wide variety of ways to approach processing and analyzing data. In this talk, Alex Antonison leads a conversation about the tools and programming languages he uses to improve productivity when building data and analytics solutions. At the end, he reviews the results of the Data Nerds Tooling survey with the group to see which tools come out on top.